perm filename OVERV[DIS,DBL]3 blob
sn#215673 filedate 1976-05-20 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00014 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00003 00002 .NSECP(Overview)
C00004 00003 .SSEC(Three-page Summary of the Project)
C00006 00004 . SSSEC(Detour: Analysis of a discovery)
C00010 00005 . SSSEC(What AM does: Syntheses of discoveries)
C00018 00006 . SSSEC(Results)
C00024 00007 . SSSEC(Conclusions)
C00026 00008 .SSEC(Ways of viewing AM as some common process)
C00028 00009 . SSSEC(AM as Hill-climbing)
C00032 00010 . SSSEC(AM as Heuristic Search)
C00044 00011 . SSSEC(AM as a Mathematician)
C00050 00012 .SSEC(Fifteen-page Summary of the entire project)
C00052 00013 .SSEC(Guide to reading the remainder of the thesis)
C00054 00014 . SSSEC(Varied Readership of this thesis)
C00059 ENDMK
C⊗;
.NSECP(Overview)
.SSEC(Three-page Summary of the Project)
<<Edit this, making it follow the general outline of the thesis>
Scientists often face the difficult task of formulating research
problems which must be soluble yet nontrivial. In any given branch
of science, it is usually easier to tackle a specific given problem
than to propose interesting yet managable new questions to
investigate. For example, contrast ⊗4solving⊗* the Missionaries and
Cannibals problem with the more ill-defined reasoning which led to
⊗4inventing⊗* it.
This thesis is concerned with creative theory formation in
mathematics: how to propose interesting new concepts and plausible
hypotheses connecting them. The experimental vehicle of my research
is a computer program called ⊗2AM⊗*$$ The original meaning of this
mnemonic name has been abandoned. As Exodus states: I ↓_AM_↓ that I
↓_AM_↓. $ which carries out some of the activities involved in
mathematical research: noticing simple relationships in empirical
data, formulating new definitions out of existing ones, deciding
carefully what to explore next, evaluating the overall worth of new
concepts.
. SSSEC(Detour: Analysis of a discovery)
Before discussing how to ⊗4synthesize⊗* a new theory, consider
briefly how to ⊗4analyze⊗* one, how to construct a plausible chain of
reasoning which terminates in a given discovery. One can do this by
working backwards, by reducing the creative act to simpler and
simpler creative acts. For example, consider the concept of prime
numbers. How might one be led to define such a notion? Notice the
following plausible strategy:
.ONCE INDENT 9,9,9 SELECT 6
"If f is a function which transforms elements of A into elements of
B, and B is ordered, then consider just those members of A which are
transformed into ⊗4extremal⊗* elements of B. This set is an
interesting subset of A."
When f(x) means "divisors of x", and the ordering is "by length",
this heuristic says to consider those numbers which have a minimal$$
The other extreme, numbers with a MAXIMAL number of factors, was also
proposed by AM as worth investigating. This led AM to many
interesting questions. See Appendix {[2]MAXDIV}. $ number of factors
-- that is, the primes. So this rule actually ⊗4reduces⊗* our task
from "proposing the concept of prime numbers" to the more elementary
problems of "discovering ordering-by-length" and "inventing
divisors-of".
But suppose we know this general rule: ⊗6"If f is an interesting
function, consider its inverse."⊗* It reduces the task of discovering
divisors-of to the simpler task of discovering multiplication$$ Plus
noticing that multiplication is associative and commutative. $.
Eventually, this task reduces to the discovery of very basic notions,
like substitution, set-union, and equality. To explain how a given
researcher might have made a given discovery, such an analysis is
continued until that inductive task is reduced to "discovering"
notions which the researcher already knew, which were his conceptual
primitives.
. SSSEC(What AM does: Syntheses of discoveries)
Suppose a large collection of these heuristics has been assembled
(e.g., by analyzing a great many discoveries, and writing down new
heuristic rules whenever necessary). Instead of using them to
⊗4explain⊗* how a given idea might have evolved, one can imagine
starting from a basic core of knowledge and "running" the heuristics
to ⊗4generate⊗* new concepts.
Such syntheses are precisely what AM does. The program consists of a
large corpus of primitive mathematical concepts, each with a few
associated heuristics$$ Situation/action rules which function as
local "plausible move generators". Some suggest tasks for the system
to carry out, some suggest ways of satisfying a given task, etc. $.
AM's activities all serve to expand AM itself, to enlarge upon a
given body of mathematical knowledge. To cope with the enormity of
the potential "search space" involved, AM uses its heuristics as
judgmental criteria to guide development in the most promising
direction. It appears that the process of inventing worthwhile new$$
Typically, "new" means new to AM, not to Mankind; and "worthwhile"
can only be judged in hindsight. $ concepts can be guided
successfully using a collection of a few hundred such heuristics.
Each concept is represented as a frame-like data structure with 25
different facets or slots. The types of facets include: ⊗6Examples,
Definitions, Generalizations, Domain/Range, Analogies,
Interestingness,⊗* and many others. The ⊗6Beings⊗* representation
provides a convenient scheme for organizing the heuristics; for
example, the following strategy fits into the ⊗4Examples⊗* facet of
the ⊗4Predicate⊗* concept: "If, empirically, 10 times as many
elements ⊗4fail⊗* some predicate P, as ⊗4satisfy⊗* it, then some
⊗4generalization⊗* (weakened version) of P might be more interesting
than P". AM considers this suggestion after trying to fill in
examples of any predicate$$ In fact, after AM attempts to find
examples of SET-EQUALITY, so few are found that AM decides to
generalize that predicate. The result is a new predicate which means
"Has-the-same-length-as" -- i.e., a rudimentary precursor to Numbers.
$.
AM is initially given a collection of 115 core concepts, with only a
few facets filled in for each. Its sole activity is to choose some
facet of some concept, and fill in that particular slot. In so
doing, new notions will often emerge. Uninteresting ones are
forgotten, mildly interesting ones are kept as parts of one facet of
one concept, and very interesting ones are granted full concept
status. Such new ⊗6Beings⊗* have dozens of blank parts, hence the
space of possible actions (blank slots to fill in) grows rapidly.
The same heuristics are used both to suggest new directions for
investigation, and to limit attention: both to grow and to prune.
. SSSEC(Results)
The particular mathematical domains in which AM operates depend on
the choice of initial concepts. Currently, AM begins with nothing
but a scanty knowledge of concepts which Piaget might describe as
⊗4prenumerical⊗*: Sets, substitution, operations, equality, and so
on. In particular, AM is not told anything about proof,
single-valued functions, or numbers. With this basis, AM quickly
discovered$$ "Discovering" a concept means that (1) AM recognized it
as a distinguished entity (e.g., by formulating its definition) and
also (2) AM decided it was worth investigating (either because of the
interesting way it was formed, or because of surprising preliminary
empirical results). $ elementary numerical concepts (corresponding to
those we refer to as natural numbers, multiplication, factors, and
primes) and wandered around in the domain of elementary number
theory. Although it was never able to ⊗4prove⊗* the unique
factorization theorem, AM actually did ⊗4conjecture⊗* it.
AM was not able to discover any "new-to-Mankind" mathematics purely
on its own, but ⊗4has⊗* discovered several interesting notions
hiterto unknown to the author, and has inspired one small new result
in number theory.↑1 A synergetic AM--human combination can sometimes
produce better research than either could alone.
Everything that AM does can be viewed as testing the underlying body
of heuristics. Gradually, this knowledge becomes better organized,
its implications clearer. The resultant body of detailed heuristics
may be the germ of a more efficient programme for educating math
students than the current dogma$$ Currently, the educator takes the
very best work any mathematician has ever done, polishes it until its
brilliance is blinding, and presents it to the student to induce
upon. Many individuals (e.g., Knuth and Polya) have pointed out this
blunder. A few (e.g., Papert at MIT, Adams at Stanford) are
experimenting with more realistic strategies for "teaching"
creativity. $.
Another benefit of actually constructing AM is that of
⊗4experimentation⊗*: one can vary the concepts AM starts with, vary
the heuristics available, etc., and study the effects on AM's
behavior. Several such experiments were performed. One involved
adding a couple dozen new concepts from an entirely new domain: plane
geometry. AM busied itself exploring elementary geometric concepts,
and was almost as productive there as in its original domain. New
concepts were defined, and new conjectures formulated$$ A cute result
was obtained there: any angle (between 0 and 180↑o) can be
approximated to within one degree, as the sum of two angles drawn
from the set α{0↑o, 1↑o, 2↑o, 3↑o, 5↑o, 7↑o, 11↑o,..., 179↑oα}, i.e.,
the set of angles of prime size. $. Other experiments indicated that
AM was more robust than anticipated; it withstood many kinds of
"de-tuning". Others demonstrated the tremendous impact that a few
key concepts (e.g., Equality) had on AM's behavior. Several more
experiments have been planned for the near future.
. SSSEC(Conclusions)
AM is forced to judge ⊗4a priori⊗* the value of each new concept, to
lose interest quickly in concepts which aren't going to develop into
anything. Often, such judgments can only be based on hindsight. For
similar reasons, AM has difficulty formulating new heuristics which
are relevant to the new concepts it creates. Heuristics are often
merely compiled hindsight. While AM's "approach" to empirical
research may be used in other scientific domains, the main limitation
(reliance on hindsight) will probably recur. This prevents AM from
progressing indefinitely far on its own.
This ultimate limitation was reached. AM's performace degraded more
and more as it progressed further away from its initial base of
concepts. Nevertheless, AM demonstrated that selected aspects of
creative discovery in elementary mathematics could be adequately
represented as a heuristic search process. Actually constructing a
computer model of this activity has provided an experimental vehicle
for studying the dynamics of plausible empirical inference.
.SSEC(Ways of viewing AM as some common process)
This section will provide a few metaphors: some hints for squeezing
AM into paradigms with which the reader might be familiar. For
example, the existence of heuristics in AM is quite similar to the
presence of domain-specific information in any knowledge-based
system.
Consider assumptions, axioms, definitions, and theorems to be
syntactic rules for the language that we call Mathematics. Thus
theorem-proving, and the whole of textbook mathematics, is a purely
syntactic process. Then the heuristic rules used by a mathematician
(and by AM) would correspond to the semantic knowledge associated
with these more formal methods.
Just as one can upgrade natural-language-understanding by
incorporating semantic knowledge, AM is only as successful as the
heuristics it knows.
Three more ways of "viewing" AM as something else will be provided:
(i) AM as a hill-climber, (ii) AM as a heuristic search program, and
(iii) AM as a mathematician.
. SSSEC(AM as Hill-climbing)
Let's draw an analogy between developing mathematics and the familiar
process of hill-climbing. We may visualize AM as exploring a space
using an evaluation function which imparts to it a topography.
Consider AM's core of very simple knowledge. By compounding the
known concepts and methods, AM can extend this foundation a little
wherever it wishes. The incredible variety of alternatives to
investigate includes all known mathematics, much trivia, countless
deadends, and so on. The only "successful" paths near the core are
the narrow ribbons of known mathematics (perhaps with a few other
undiscovered slivers).
How can AM walk through this immense space, with any hope of
following the few, slender branches of already-established
mathematics (or some equally successful new fields)? AM must do
hill-climbing: As new concepts are formed, decide how promising they
are, always explore the currently most-promising new concept. The
evaluation function is quite nontrivial, and this research may be
viewed as an attempt to study and explain and duplicate the
judgmental criteria people employ. Attempts at codifying such
"mysterious" emotive forces as intuition, aesthetics, utility,
richness, interestingness, relevance... indicated that a large but
not unmanagable collection of heuristic rules should suffice.
The important visualization to make is that with proper evaluation
criteria, AM's planar mass of interrelated concepts is transformed
into a breath-taking relief map: the known lines of development
become mountain ranges, soaring above the vast flat plains of trivia
and inconsistency below.
Occasionally an isolated hill is discovered near the core;$$ E.g.,
Conway's numbers, as described in Knuth's ↓_Surreal Numbers_↓ $
certainly whole ranges lie undiscovered for long periods of time$$
E.g., non-Euclidean geometries $, and the terrrain far from the
initial core is not yet explored at all.
. SSSEC(AM as Heuristic Search)
As the title of this section -- and this thesis -- proclaims, AM is a
kind of "heuristic search" program. That must mean that AM is
exploring a particular "space," using some informal evaluation
criteria to guide it.
The flavor of search which is used here is that of progressively
enlarging a tree. Heuristics are used to decide which node of the
tree to expand next, and to produce from that node a few interesting
successor nodes. To do mathematical research well, I claim that it is
necessary and sufficent to have good methods for proposing new
concepts from existing ones, and for deciding how interesting each
candidate (concept, node) is.
AM explores mathematics by selectively enlarging itself: AM ⊗4is⊗* a
body of mathematical knowledge (concepts, plus the wisdom to use them
effectively). To see this, we must explain what the nodes of AM's
search space are, what the operators or links are, where the
heuristic information comes into play, and what the evaluation
function is.
AM's space can be considered to consist of all nodes which are
consistent, partially-filled-in concepts. That is, a primitive "legal
move" for AM would be to (i) enlarge some facet of some concept, or
(ii) create a new, partially-complete concept. Consider momentarily
the size of this space. Since there is no constraint on what the new
concepts can be, and no informal knowledge for quickly finding
entries for a desired facet, a blind "legal-move" program would go
nowhere -- slowly! One shouldn't even call the activity such a
program would be doing "math research."
The heuristic rules are used as little "plausible move generators".
They suggest what facet of what concept to enlarge next, and they
suggest specific new concepts to create. The only activities which AM
will consider doing are those which have been motivated for some
specific good reason. The validity of that last statement of course
depends on the validity of the heuristic rules. This is the
programmer's responsibility.
AM has a definite algorithm for rating the nodes of its space.
Namely, the heuristic rules provide enough information to
meaningfully order the tasks on the agenda list. Yet AM has no
specific goal criteria: it can't quit just because a dynamite task
has been proposed. AM goes on forever$$ Technically, forever is about
100,000 list cells and a couple cpu hours. $.
Consider Nilsson's description of depth-first searching, and of
breadth-first searching. He has us maintain a list of "open" nodes.
Repeatedly, he plucks the top one and expands it. In the process,
some new nodes may be added to the Open list. In the case of
depth-first searching, they are added at the top; the next node to
expand is the one most recently created; the Open-list is being used
as a push-down stack. For breadth-first search, new nodes are added
at the bottom; they aren't expanded until all the older nodes have
been; the Open-list is used as a queue. For heuristic search, or
"best-first" search, they are evaluated in some numeric way, and then
"merged" into the already-sorted list of Open nodes.
.ONCE TURN ON "{}"
This process is very similar to the ⊗4agenda⊗* mechanism AM uses to
manage its search. This will be discussed in detail in Chapter
{[2]AGENDA}. The agenda is a list of plausible tasks for AM to do,
plus supporting reasons for each task. When a task is suggested for
some reason, it is added to the agenda. A task may be suggested
several times, for different reasons. A global priority value is
assigned to each task, based on the combined value of its reasons.
The control structure of AM is simply to select the task with the
highest priority, execute it, and select a new one. The "agenda"
appears to be a very well-suited data structure for managing a
"best-first" search process.
Similar control structures were used in Dendral$$ The "Predictor"
part of DENDRAL. See [MI4 ref.]. $, SIMULA [reference], and KRL
[reference]. The main difference is that in AM, symbolic reasons are
used (albeit in trivial token-like ways) to decide whether -- and how
much -- to boost the priority of a task when it is suggested again.
There are several difficulties and anomalies in forcing AM into the
heuristic search paradigm. AM's heuristics are used as plausible
move generators; if they are all removed, AM would have nothing to
do. In traditional heuristic searches, the heuristics are separate
from a "legal move generator", hence could all be eliminated: they
merely help constrain a single generator.
Another anomaly is that the operators which AM uses to enlarge and
explore the space of concepts are themselves mathematical concepts
(e.g., some heuristic rules result in the creation of new rules;
"Compose" is both a concept and an operation which results in new
concepts). Thus AM should be viewed as a mass of knowledge which
enlarges ⊗4itself⊗* repeatedly. As far as I know, all previous
computer programs kept the information they "discovered" quite
separate from the knowledge they used to make discoveries$$ Of course
this is typically because the two kinds of knowledge are very
different: For a chess-player, the first kind is "good board
positions," and the second is "strategies for making a good move."
So-called "learning programs" typically learn one specific task, not
how to become better learners. etc. $.
Perhaps the greatest difference between AM and typical heuristic
search procedures is that AM has no well-defined target concepts or
target relationships. Rather, its "goal criteria" -- its sole aim --
is to maximize the interestingness level of the activities it
performs, the priority ratings of the top tasks on the agenda. It
doesn't matter precisely which definitions or conjectures AM
discovers -- or misses -- so long as it spends its time on plausible
tasks. There is no fixed set of theorems that AM should discover, so
AM is not a typical ⊗4problem-solver⊗*. There is no fixed set of
traps AM should avoid, and no winning/losing behavior, so AM is not a
typical ⊗4game-player⊗*.
For example, no stigma is attached to the fact that AM never
discovered real numbers$$ There are many "nice" things which AM
didn't -- and can't -- do: e.g., devising ↓_geometric_↓ concepts from
its initial simple set-theoretic knowledge. See the discussion of
the limitations of AM, Section {[2]DIFSECNUM}.{[2]DIFSSECNUM}. $; it
was rather surprising that AM managed to discover ⊗4natural⊗*
numbers! Even if it hadn't done that, it would have been
acceptable$$ Acceptable to whom? Is there really a domain-invariant
criterion for judging the quality of AM's actions? See the
discussions in Section {[2]EVALU}.1. $ if AM had simply gone off and
developed ideas in set theory.
. SSSEC(AM as a Mathematician)
Before diving into the innards of AM, let's take a moment to discuss
the totality of the mathematics which AM carried out. Like a
contemporary historian summarizing the work of the Babylonian
mathematicians, we shan't hesitate to use current terms and criticize
by current standards.
AM began its investigations with scanty knowledge of a few
set-theoretic concepts (sets, equality of sets, set operations).
Most of the obvious set-theory relations (e.g., de Morgan's laws)
were eventually uncovered; since AM never fully understood abstract
algebra, the statement and verification of each of these was quite
obscure. AM never derived a formal notion of infinity, but it
naively established conjectures like "a set can never be a member of
itself", and procedures for making chains of new sets ("insert a set
into itself"). No sophisticated set theory (e.g., diagonalization)
was ever done.
After this initial period of exploration, AM decided that "equality"
was worth generalizing, and thereby discovered the relation
"same-size-as". "Natural numbers" were based on this, and soon most
simple arithmetic operations were defined. Since addition arose as an
analog to union, and multiplication as an analog to cross-product, it
came as quite a surprise when AM noticed that they were related
(namely, N+N=2xN). AM later re-discovered multiplication in three
other ways. One of these was as repeated addition. Exponentiation
was defined as repeated multiplication. Unfortunately, AM never found
any obvious properties of exponentiation, hence lost all interest in
it.
Soon after defining multiplication, AM investigated the process of
multiplying a number by itself: squaring. The inverse of this turned
out to be interesting, and led to the definition of square-root.
AM remmined content to play around with the concept of
integer-square-root. Although it defined the set of numbers which had
no square root, AM was never close to discovering rationals, let
alone irrationals. Raising to fourth-powers, and fourth-rooting,
were discovered at this time. Perfect squares and perfect
fourth-powers were isolated.
.ONCE TURN ON "{}"
The associativity and commutativity of multiplication indicated that
it could accept a BAG of numbers as its argument. This led to the
notion of factoring a number. Minimally-factorable numbers turned out
to be what we call primes. Maximally-factorable numbers were also
thought to be interesting.
AM conjectured the fundamental theorem of arithmetic (unique
factorization into primes) and Goldbach's conjecture (every even
number >2 is the sum of two primes) in a surprisingly symmetric way.
The unary representation of numbers gave way to a representation as a
bag of primes (based on unique factorization), but AM never thought
of exponential notation. Since the key concepts of remainder,
greater-than, and exponentiation were never mastered, progress in
number theory was arrested.
.SSEC(Fifteen-page Summary of the entire project)
<<This section is not written yet. Sorry. >
<potential organization: mirror the overall organization of the thesis itself>
Include the following points on Motivation (why is this worthwhile?):
.B
Inherent interest of getting a handle on the task (sci. creativity)
Personal belief that discovery can be (ought to be) demystified
Potential for learning, from the system, more about the process
of sci. concept formation, thy. formation, chance discovery
(do experiments on the implementations: eg, vary AM's heurs)
Potential usefulness of the implementations themselves (including AM)
Aids to research; i.e., ultimately: new discoveries.
Potential to education: like Mycin, extract heurs. and teach them
All the usual bad reasons:
"Look ma, no hands" + maternal drives + ego + thesis drives +...
Historical:
Need task with no specific goal, to test BEINGs ideas.
Disenchantment with theorem-provers that plod along, in contrast
to the processes which my model of math demands: intu, need,
aesth., multiple reprs, proposing vs proving, fixed task.
.E
.SSEC(Guide to reading the remainder of the thesis)
<<This guide is not written yet. Sorry. >
.B
i) Overall organization of the thesis
ii) Plans for what to read (and in what order), depending on your interests
Plan for those interested in the AI ideas
Plan for those interested in the systems ideas
Plan for those interested in mathematics
iii) Pre-requisites and how to satisfy them, for each chapter
For those with little pure mathematics in their background
For those with little computer science background
For computer scientists with little contact to AI before
<either organized by "type" of reader, or by chapter/section>
.E
. SSSEC(Varied Readership of this thesis)
This thesis -- and its readers -- must come to grips with a very
interdisciplinary problem. For the reader whose background is in
Artificial Intelligence, most of the system's actions -- the
"mathematics" it does -- may seem inherently uninteresting. For the
mathematician, the word "LISP" signifies nothing beyond a speech
impediment (to Artificial Intelligence types it also connotes a
programming impediment). If I don't describe "LISP" the first time I
mention it, a large fraction of potential readers will never realize
that potential. If I ⊗4do⊗* stop to describe LISP, the other readers
will be bored.
.ONCE TURN ON "{}"
In an attempt not to lose readers due to jargon, two glossaries of
terms have been compiled. Appendix {[2]GLOS}.1 contains capsule
descriptions of about 100 mathematical terms, ideas, notations, etc.
Appendix {[2]GLOS}.2 renders the analogous service for Artificial
Intelligence jargon and computer science concepts.